Abstract

We give a criterion of the form Q(d)c(M) < 1 for the non-reconstructability
of tree-indexed q-state Markov chains obtained by broadcasting a signal from the
root with a given transition matrix M. Here c(M) is an explicit function, which
is convex over the set of M's with a given invariant distribution, that is defined in
terms of a (q-1)-dimensional variational problem over symmetric entropies. Further,
Q(d) is the expected number of offspring on the Galton-Watson tree.

This result is equivalent to proving the extremality of the free boundary condi- 
tion Gibbs measure within the corresponding Gibbs-simplex. 

Our theorem holds for possibly non-reversible M, and its proof is based on a general
recursion formula for expectations of a symmetrized relative entropy function,
which invites their use as a Lyapunov function. In the case of the Potts model, the
present theorem reproduces earlier results of the authors, with a simplified proof. In
the case of the symmetric Ising model (where the argument becomes similar to the
approach of Pemantle and Peres) the method produces the correct reconstruction
threshold. In the case of the (strongly) asymmetric Ising model, where the
Kesten-Stigum bound is known to be not sharp, the method provides improved numerical
bounds.

AMS 2000 subject classification: 60K35, 82B20, 82B44 

Keywords: Broadcasting on trees, Gibbs measures, random tree, Galton-Watson

1 Introduction

The problem of reconstruction of Markov chains on d-ary trees has enjoyed much interest 
in recent years. There are multiple reasons for this, one of them being that it is a 
topic where people from information theory, researchers in mathematical statistical 
mechanics, pure probabilists, and people from the theoretical physics side of statistical 
mechanics can meet and make contributions. 

Indeed, starting with the symmetric Ising channel, for which the reconstruction
threshold was settled in [9], using different methods and increased generality w.r.t.
the underlying tree, there have been publications by, among others, Borgs, Chayes, Janson,
Mossel and Peres from the mathematics side [12], [13], [11], [2], deriving upper and lower bounds on
the reconstruction threshold for certain models of finite state tree-indexed Markov chains.
From the theoretical physics side let us highlight [16] on trees (see also [7] on graphs)
which contains a discussion of the potential relevance of the reconstruction problem also 
to the glass problem. That paper also provides numerical values for the Potts model on 
the basis of extensive simulation results. The Potts model is interesting because, unlike 
the Ising model, the true reconstruction threshold behaves (respectively is expected to 
behave) non-trivially as a function of the degree d of the underlying d-ary tree and the 
number of states q. For a discussion of this see the conjectures in [16], the rigorous
bounds in [5], and in particular the proof in [17] showing that the Kesten-Stigum bound
is not sharp if $q \ge 5$, and sharp if q = 3, for large enough d. We refer to [1] for a
general computational method to obtain non-trivial rigorous bounds for reconstruction
on trees.

Now, our treatment is motivated in the generality of its setup by the questions raised 
and type of results given in [15], and technically somewhat inspired by [14], [5]. Indeed,
for the Potts model the present paper reproduces the result of [5] (where moreover we 
provide numerical estimates on the reconstruction inverse temperature, also in the small 
q, small d regime.) However, in the present paper the focus is on generality, that is the 
universality of the type of estimate, and the structural clarity of the proof. It should 
be clear that the condition we provide can be easily implemented in any given model to 
produce numerical estimates on reconstruction thresholds. 

The remainder of the paper is organized as follows. Section 2 contains the definition
of the model and the statement of the theorem. Section 3 contains the proof.



2 Model and result 

Consider an infinite rooted tree $T$ having no leaves. For $v, w \in T$ we write $v \to w$ if $w$
is the child of $v$, and we denote by $|v|$ the distance of a vertex $v$ to the root. We write
$T^N$ for the subtree of all vertices with distance $\le N$ to the root.

To each vertex $v$ there is associated a (spin-) variable $\sigma(v)$ taking values in a finite
space which, without loss of generality, will be denoted by $\{1, 2, \dots, q\}$. Our model will
be defined in terms of the stochastic matrix with non-zero entries

$$M = (M(v,w))_{1 \le v,w \le q}. \tag{1}$$

By the Perron-Frobenius theorem there is a unique single-site measure $\alpha = (\alpha(j))_{j=1,\dots,q}$
which is invariant under the application of the transition matrix $M$, meaning that

$$\sum_{i=1}^{q} \alpha(i) M(i,j) = \alpha(j) \quad \text{for all } j \in \{1, \dots, q\}.$$

The object of our study is the corresponding tree-indexed Markov chain in equilibrium.
This is the probability distribution $P$ on $\{1, \dots, q\}^{T}$ whose restrictions $P_{T^N}$ to
the state spaces of finite trees $\{1, \dots, q\}^{T^N}$ are given by

$$P_{T^N}(\sigma_{T^N}) = \alpha(\sigma(0)) \prod_{v \to w} M(\sigma(v), \sigma(w)). \tag{2}$$

The notion of equilibrium refers to the fact that all single-site marginals are given by the
invariant measure $\alpha$.
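
As an illustration of the broadcasting picture behind (2), here is a minimal Python sketch that samples the chain generation by generation on a regular d-ary tree. It is an illustration only: the function names `invariant_distribution` and `broadcast`, the labelling of the states by 0, ..., q-1, the restriction to a regular tree, and the use of power iteration to approximate the invariant measure are assumptions made for this sketch and are not taken from the paper.

```python
import random

def invariant_distribution(M, iters=1000):
    """Approximate the row vector alpha with alpha M = alpha by power iteration.

    Assumes M is a stochastic matrix with strictly positive entries.
    """
    q = len(M)
    alpha = [1.0 / q] * q
    for _ in range(iters):
        alpha = [sum(alpha[i] * M[i][j] for i in range(q)) for j in range(q)]
    return alpha

def broadcast(M, alpha, d, depth, rng):
    """Sample the spins on a d-ary tree down to the given depth.

    Returns a list of generations; generation n holds the d**n spins
    at distance n from the root (states are labelled 0, ..., q-1).
    """
    q = len(M)
    root = rng.choices(range(q), weights=alpha)[0]   # root spin drawn from alpha
    generations = [[root]]
    for _ in range(depth):
        children = []
        for parent in generations[-1]:
            # each child is drawn from the row of M indexed by its parent's spin
            children.extend(rng.choices(range(q), weights=M[parent], k=d))
        generations.append(children)
    return generations

M = [[0.7, 0.3], [0.9, 0.1]]
alpha = invariant_distribution(M)
sample = broadcast(M, alpha, d=2, depth=3, rng=random.Random(0))
print([len(g) for g in sample])
```

Because the root is started in the invariant measure, every single-site marginal of the sampled configuration is again the invariant measure, which is the equilibrium property stated above.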
A probability measure $\mu$ on $\{1, 2, \dots, q\}^{T}$ is called a Gibbs measure if it has the same
finite-volume conditional probabilities as $P$ has. This means that, for all finite subsets
$V \subset T$, we have for all $N$ sufficiently large

$$\mu(\sigma_V \mid \sigma_{V^c}) = P_{T^N}(\sigma_V \mid \sigma_{T^N \setminus V}) \tag{3}$$

$\mu$-almost surely. The Gibbs measures, being defined in terms of a linear equation,
form a simplex, and we would like to understand its structure, and exhibit its extremal 
elements [6]. Multiple Gibbs measures (phase transitions) may occur if the loss of memory
in the transition described by M is small enough compared to the proliferation of
offspring along the tree T. Uniqueness of the Gibbs measure trivially implies extremal- 
ity of the measure P, but interestingly the converse is not true. Parametrizing M by 
a temperature-like parameter may lead to two different transition temperatures, one 
where P becomes extremal and one where the Gibbs measure becomes unique. Broadly 
speaking, statistical mechanics models with two transition temperatures are peculiar to 
trees (and more generally to models indexed by non-amenable graphs [10]). This is one
of the reasons for our interest in models on trees. 

Now, our present aim is to provide a general criterion, depending on the model 
only in a local (finite-dimensional) way, which implies the extremality of P, and which 
works also in regimes of non-uniqueness. People with statistical mechanics background 
may think of it as an analogy to Dobrushin's theorem saying that $c(\gamma) < 1$ implies the
uniqueness of the Gibbs measure of a local specification $\gamma$, where $c(\gamma)$ is determined in
terms of local (single-site) quantities.

In fact, Martinelli et al. [15] (see Theorem 9.3, see also Theorem 9.3' and Theorem
9.3'') give such a theorem. Their criterion for non-reconstruction of Markov chains on
d-ary trees has the form $d \lambda_2 \kappa < 1$, where $\kappa$ is the Dobrushin constant [6] of the system
of conditional probabilities described by P. Further, $\lambda_2$ is the second eigenvalue of M.
Our theorem takes a different form. Now, to formulate our result we need the following 
notation. 

We write for the simplex of length-$q$ probability vectors

$$\mathcal{P} = \Big\{ (p(i))_{i=1,\dots,q} :\ p(i) \ge 0 \ \forall i,\ \sum_{i=1}^{q} p(i) = 1 \Big\} \tag{4}$$

and we denote the relative entropy between probability vectors $p, \alpha \in \mathcal{P}$ by
$S(p|\alpha) = \sum_{i=1}^{q} p(i) \log \frac{p(i)}{\alpha(i)}$. The symmetrized entropy between $p$ and $\alpha$ is

$$L(p) = S(p|\alpha) + S(\alpha|p) = \sum_{i=1}^{q} (p(i) - \alpha(i)) \log \frac{p(i)}{\alpha(i)}. \tag{5}$$

While the symmetrized entropy is not a metric (since the triangle inequality fails) it
serves us as a "distance" to the invariant measure $\alpha$.

Let us define the constant, depending solely on the transition matrix $M$, in terms
of the following supremum over probability vectors

$$c(M) = \sup_{p \in \mathcal{P}} \frac{L(p M^{\mathrm{rev}})}{L(p)}, \tag{6}$$

where $M^{\mathrm{rev}}(i,j) = \frac{\alpha(j) M(j,i)}{\alpha(i)}$ is the transition matrix of the reversed chain. Note that
numerator and denominator vanish when we take for $p$ the invariant distribution $\alpha$.
Consider a Galton-Watson tree with i.i.d. offspring distribution concentrated on $\{1, 2, \dots\}$
and denote the corresponding expected number of offspring by Q(d).
Here is our main result.

Theorem 2.1 If Q(d)c(M) < 1 then the tree-indexed Markov chain P on the Galton-Watson
tree T is extremal for Q-almost every tree T. Equivalently, in information
theoretic language, there is no reconstruction.

Remark 1. The computation of the constant c(M) for a given transition matrix 
M is a simple numerical task. Note that fast mixing of the Markov chain corresponds 
to small c(M). In this sense c(M) is an effective quantity depending on the interaction 
M that parallels the definition of the Dobrushin constant $c_D(\gamma)$ in the theory of Gibbs
measures, which measures the degree of dependence in a local specification. While the latter
depends on the structure of the interaction graph, this is not the case for c(M).
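
To make the "simple numerical task" of Remark 1 concrete, here is a minimal Python sketch that estimates c(M) from below by random search over the simplex. It assumes the definition of c(M) as the supremum of L(p M^rev)/L(p) over probability vectors p, with M^rev(i,j) = α(j)M(j,i)/α(i). The function names, the sampling scheme (uniform points of the simplex pulled towards α by a random factor) and the cutoff `1e-10` are choices made for this sketch, not part of the paper; a serious computation would rather use a proper optimizer.

```python
import math
import random

def invariant_distribution(M, iters=2000):
    """Approximate alpha with alpha M = alpha by power iteration (M has positive entries)."""
    q = len(M)
    alpha = [1.0 / q] * q
    for _ in range(iters):
        alpha = [sum(alpha[i] * M[i][j] for i in range(q)) for j in range(q)]
    return alpha

def reversed_chain(M, alpha):
    """M_rev(i, j) = alpha(j) M(j, i) / alpha(i)."""
    q = len(M)
    return [[alpha[j] * M[j][i] / alpha[i] for j in range(q)] for i in range(q)]

def sym_entropy(p, alpha):
    """L(p) = sum_i (p(i) - alpha(i)) log(p(i) / alpha(i)), for positive vectors."""
    return sum((pi - ai) * math.log(pi / ai) for pi, ai in zip(p, alpha))

def estimate_c(M, samples=20000, seed=0):
    """Random-search estimate (from below) of sup_p L(p M_rev) / L(p)."""
    rng = random.Random(seed)
    q = len(M)
    alpha = invariant_distribution(M)
    M_rev = reversed_chain(M, alpha)
    best = 0.0
    for _ in range(samples):
        # uniform point of the simplex via normalised exponential variables
        w = [rng.expovariate(1.0) for _ in range(q)]
        total = sum(w)
        # pull the point towards alpha by a random factor, so that small
        # neighbourhoods of alpha (where the ratio is a 0/0 limit) are probed too
        t = rng.random() ** 3
        p = [(1.0 - t) * alpha[i] + t * w[i] / total for i in range(q)]
        denom = sym_entropy(p, alpha)
        if denom > 1e-10:  # skip points numerically indistinguishable from alpha
            p_rev = [sum(p[i] * M_rev[i][j] for i in range(q)) for j in range(q)]
            best = max(best, sym_entropy(p_rev, alpha) / denom)
    return best

print(estimate_c([[0.8, 0.2], [0.2, 0.8]]))
```

For a given expected offspring number, the criterion of Theorem 2.1 can then be checked by comparing the product of that number with the estimate against 1, keeping in mind that a random search only bounds the supremum from below.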

Remark 2. Non-uniqueness of the Gibbs measures corresponds to the existence
of boundary conditions which will cause the corresponding finite-volume conditional
probabilities to converge to different limits. Extremality of the measure P means that
conditioning the measure P to acquire a configuration $\xi$ at a distance larger than N will
cease to have an influence on the state at the root if $\xi$ is chosen according to the measure
P itself and N is tending to infinity. In the language of information theory this is called
non-reconstructability (of the state at the origin on the basis of noisy observations far
away).

Remark 3. (On irreversibility.) If M is any transition matrix reversible for
the equidistribution and, for a permutation $\pi$ of the numbers from 1 to q, we define
$M^{\pi}(i,j) = M(i, \pi^{-1} j)$, then $c(M) = c(M^{\pi})$ for all permutations $\pi$. This is seen by a
simple computation. We can say that an irreversibility in the Markov chain which is
caused by a deterministic stepwise renaming of labels (by $\pi$) is not seen in the constant.

Remark 4. (On Convexity.) For all fixed probability vectors $\alpha$ the function
$M \mapsto c(M)$ is convex on the set of transition matrices which have $\alpha$ as their invariant
distribution, i.e. $\alpha M = \alpha$.

This is a consequence of the fact that, for $M_1, M_2$ with $\alpha M_1 = \alpha$, $\alpha M_2 = \alpha$, we
have that $(\lambda M_1 + (1 - \lambda) M_2)^{\mathrm{rev}} = \lambda M_1^{\mathrm{rev}} + (1 - \lambda) M_2^{\mathrm{rev}}$ and that the relative entropy
is convex in the first and second argument.

This implies that, for each $\alpha$ and fixed degree d, the set $\{M : \alpha M = \alpha,\ d\, c(M) < 1\}$, for
which the criterion ensures non-reconstruction, is convex.

We conclude this introduction with the discussion of two main types of test-examples. 

Example 1 (Symmetric Potts and Ising model.) The Potts model with q
states at inverse temperature $\beta$ is defined by the transition matrix

$$M_\beta = \frac{1}{e^{2\beta} + q - 1} \begin{pmatrix} e^{2\beta} & 1 & 1 & \cdots & 1 \\ 1 & e^{2\beta} & 1 & \cdots & 1 \\ 1 & 1 & e^{2\beta} & \cdots & 1 \\ \vdots & & & \ddots & \vdots \end{pmatrix}. \tag{7}$$

This Markov chain is reversible for the equidistribution. In the case q = 2, the Ising
model, one computes $c(M_\beta) = (\tanh \beta)^2$, which yields the correct reconstruction threshold.
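
The Ising value can be checked numerically. The sketch below, in Python, evaluates the ratio L(pM)/L(p) for q = 2 on a grid of vectors p = (1/2 + x, 1/2 - x) and compares the largest value with the square of tanh β. It assumes the definition of c(M) via the symmetrized entropy L and uses the fact that the reversed chain equals M here, because the chain is reversible for the equidistribution. The helper names, the grid and the choice β = 0.4 are assumptions of this sketch.

```python
import math

def sym_entropy(p, alpha):
    """L(p) = S(p|alpha) + S(alpha|p) for strictly positive probability vectors."""
    return sum((pi - ai) * math.log(pi / ai) for pi, ai in zip(p, alpha))

def potts_matrix(beta, q):
    """Potts transition matrix: weight e^{2 beta} on the diagonal, 1 elsewhere."""
    w = math.exp(2.0 * beta)
    z = w + q - 1
    return [[(w if i == j else 1.0) / z for j in range(q)] for i in range(q)]

def ising_ratio(beta, x):
    """L(p M) / L(p) for q = 2 and p = (1/2 + x, 1/2 - x) with 0 < x < 1/2."""
    M = potts_matrix(beta, 2)
    alpha = [0.5, 0.5]
    p = [0.5 + x, 0.5 - x]
    pM = [p[0] * M[0][j] + p[1] * M[1][j] for j in range(2)]
    return sym_entropy(pM, alpha) / sym_entropy(p, alpha)

beta = 0.4
grid = [k / 1000.0 for k in range(1, 500)]   # x = 0.001, ..., 0.499
c_grid = max(ising_ratio(beta, x) for x in grid)
print(c_grid, math.tanh(beta) ** 2)
```

On this grid the largest ratio is found at the smallest x, in line with the supremum being approached as p tends to the equidistribution.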
Theorem 2.1 is a generalization of the main result given in our paper [5] for the
specific case of the Potts model. That paper also contains comparisons of numerical
values to the (presumed) exact transition values. Our discussion of the cases of q = 3, 4, 5
shows closeness up to a few percent, and for q = 3, 4 and small d these are the best
rigorous bounds as of today. To see this connection between the present paper and [5] we
rewrite $c(M_\beta) = \frac{e^{2\beta} - 1}{e^{2\beta} + q - 1}\, \bar c(\beta, q)$ and note that the main Theorem of [5] was formulated
in terms of the quantity

$$\bar c(\beta, q) = \sup_{p \in \mathcal{P}} \frac{\sum_{i=1}^{q} (q\, p(i) - 1) \log\big(1 + (e^{2\beta} - 1)\, p(i)\big)}{\sum_{i=1}^{q} (q\, p(i) - 1) \log\big(q\, p(i)\big)}. \tag{8}$$

Numerical Example 2 (Non-symmetric Ising model.) Consider the following
transition matrix

$$M = \begin{pmatrix} 1 - \delta_1 & \delta_1 \\ 1 - \delta_2 & \delta_2 \end{pmatrix} \quad \text{with } \delta_1, \delta_2 \in [0,1]. \tag{9}$$

The chain is not symmetric when $1 - \delta_1 \neq \delta_2$. Let us focus on regular trees. Mossel
and Peres in [13] prove that, on a regular tree of degree d, the reconstruction problem
defined by the matrix $M$ is unsolvable when

$$d\, \frac{(\delta_2 - \delta_1)^2}{\min\{\delta_1 + \delta_2,\ 2 - \delta_1 - \delta_2\}} < 1, \tag{10}$$

while Martin in [3] gives the following condition for non-reconstructibility

$$d \left( \sqrt{(1 - \delta_1)\,\delta_2} - \sqrt{(1 - \delta_2)\,\delta_1} \right)^2 < 1. \tag{11}$$

By the Kesten-Stigum bound it is known that there is reconstruction when
$d(\delta_2 - \delta_1)^2 > 1$. When $\delta_1 + \delta_2 = 1$, the matrix $M$ is symmetric and the Kesten-Stigum bound
is sharp. Recently, Borgs, Chayes, Mossel and Roch in [2] have shown with an elegant
proof that the Kesten-Stigum threshold is tight for roughly symmetric binary channels,
i.e. when $|1 - (\delta_1 + \delta_2)| < \delta$, for some small $\delta$. Even though the threshold we give is very close
to the Kesten-Stigum bound when the chain has a small asymmetry, we are so far not
able to recover this sharp estimate with our method. For large asymmetry the
Kesten-Stigum bound has been proved not to be sharp: Mossel proves as Theorem 1 in [12] that,
for any $\lambda > \frac{1}{d}$, there exists a $\delta(\lambda)$ such that there is reconstruction for $\delta_1$, $\delta_2 = \lambda + \delta_1$
when $\delta_1 < \delta$. On a Cayley tree with coordination number d, non-reconstruction for
the Markov chain (9) with $\delta_2 = 0$ (or $1 - \delta_1 = 0$) is equivalent to the extremality of
the Gibbs measure for the hard-core model with activity $\frac{\delta_1}{1 - \delta_1} \left( \frac{1}{1 - \delta_1} \right)^{d}$. Restricted to
this specific case, Martin proves a better condition than the one obtained taking $\delta_2 = 0$
both in (10) and in (11).

Our entropy method provides a better bound than (11) and considerably improves
(10) for the values of $\delta_1$ and $\delta_2$ giving a strongly asymmetric chain.



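A computation gives

$$c(M) = \sup_{p \in (0,1)} \frac{(\delta_2 - \delta_1) \log \dfrac{p'\, \alpha(2)}{(1 - p')\, \alpha(1)}}{\log \dfrac{p\, \alpha(2)}{(1 - p)\, \alpha(1)}}, \qquad p' = \alpha(1) + (\delta_2 - \delta_1)(p - \alpha(1)), \tag{12}$$

where $p$ stands for the probability vector $(p, 1 - p)$ and $\alpha = \big( \frac{1 - \delta_2}{1 + \delta_1 - \delta_2}, \frac{\delta_1}{1 + \delta_1 - \delta_2} \big)$ is the invariant distribution of $M$.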
It is quite simple to compute numerically the constant c(M); the numerical outputs
and the comparisons with (10), (11) and the Kesten-Stigum bound are in Table 1. For
the particular pairs of values of $(\delta_1, \delta_2)$ we checked, the Kesten-Stigum upper bound
on the non-reconstruction thresholds for asymmetric chains is quite close to our lower
bounds.



$\delta_1 = 0.3$    KS               FK                  M         MP
                    Kesten-Stigum    Formentin-Külske    Martin    Mossel-Peres
$\delta_2 = 0.1$    0.04             0.0579              0.065     0.1
$\delta_2 = 0.2$    0.01             0.0125              0.0134    0.02
$\delta_2 = 0.4$    0.01             0.0107              0.0110    0.0143
$\delta_2 = 0.5$    0.04             0.0413              0.0417    0.05
$\delta_2 = 0.6$    0.09             0.0907              0.0910    0.1
$\delta_2 = 0.7$    0.16             0.16                0.16      0.16
$\delta_2 = 0.8$    0.25             0.2525              0.2534    0.28
$\delta_2 = 0.9$    0.36             0.3787              0.3850    0.45

Table 1: For $\delta_1 = 0.3$, the Kesten-Stigum upper bound on the non-reconstruction
thresholds for asymmetric chains is very close to ours.
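
The four columns of the table can be recomputed with a few lines of Python. The sketch below is not the authors' code. It evaluates the Kesten-Stigum quantity, the Mossel-Peres and Martin expressions from the two displayed conditions (each without the factor d), and a grid approximation of c(M) for the two-state matrix with rows (1 - δ1, δ1) and (1 - δ2, δ2). The function names, the grid size and the cutoff near the invariant distribution are assumptions of this sketch, and it uses that a two-state chain coincides with its reversed chain.

```python
import math

def kesten_stigum(d1, d2):
    """Square of the second eigenvalue d2 - d1 of M."""
    return (d2 - d1) ** 2

def mossel_peres(d1, d2):
    """Left-hand side of the Mossel-Peres condition, divided by d."""
    return (d2 - d1) ** 2 / min(d1 + d2, 2.0 - d1 - d2)

def martin(d1, d2):
    """Left-hand side of Martin's condition, divided by d."""
    return (math.sqrt((1 - d1) * d2) - math.sqrt((1 - d2) * d1)) ** 2

def L2(p1, a1):
    """Symmetrized entropy between (p1, 1 - p1) and (a1, 1 - a1)."""
    return (p1 - a1) * math.log(p1 * (1 - a1) / ((1 - p1) * a1))

def entropy_constant(d1, d2, grid=20000):
    """Grid approximation (from below) of c(M); needs 0 < d1 < 1 and 0 < d2 < 1."""
    a1 = (1 - d2) / (1 + d1 - d2)      # invariant probability of the first state
    lam = d2 - d1                      # second eigenvalue of M
    best = 0.0
    for k in range(1, grid):
        p1 = k / grid
        denom = L2(p1, a1)
        if denom > 1e-10:              # skip p numerically equal to alpha
            # (pM)(1) = a1 + lam * (p1 - a1)
            best = max(best, L2(a1 + lam * (p1 - a1), a1) / denom)
    return best

d1 = 0.3
for d2 in (0.1, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(d2, kesten_stigum(d1, d2), entropy_constant(d1, d2),
          martin(d1, d2), mossel_peres(d1, d2))
```

Each printed number is the quantity that, multiplied by d, has to stay below 1 for the corresponding non-reconstruction criterion (or above 1 for Kesten-Stigum reconstruction), so smaller entries in the FK, M and MP columns mean a stronger criterion.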



3 Proof 

We denote by $T^N$ the tree rooted at $0$ of depth $N$. The notation $T_v^N$ indicates the
sub-tree of $T^N$ rooted at $v$ obtained from "looking to the outside" on the tree $T^N$.
We denote by $P_v^N$ the measure on $T_v^N$ with free boundary conditions, or, equivalently,
the Markov chain obtained from broadcasting on the subtree with the root $v$ with the
same transition kernel, starting in $\alpha$. We denote by $P_v^{N,\xi}$ the corresponding measure on
$T_v^N$ with boundary condition on $\partial T_v^N$ given by $\xi = (\xi_i)_{i \in \partial T_v^N}$. Obviously it is obtained
by conditioning the free boundary condition measure $P_v^N$ to take the value $\xi$ on the
boundary.
We write

$$\pi_v^N = \pi_v^{N,\xi} = \big( P_v^{N,\xi}(\sigma(v) = s) \big)_{s = 1, \dots, q}. \tag{13}$$

To control a recursion for these quantities along the tree we find it useful to make 
explicit the following notion. 

Definition 3.1 We call a real-valued function $L$ on $\mathcal{P}$ a linear stochastic Lyapunov
function with center $p^*$ if there is a constant $c$ such that

• $L(p) \ge 0$ for all $p \in \mathcal{P}$, with equality if and only if $p = p^*$;
• $\mathbb{E}\, L(\pi_v^N) \le c \sum_{w:\, v \to w} \mathbb{E}\, L(\pi_w^N)$.

Proposition 3.2 Consider a tree-indexed Markov chain $P$, with transition kernel $M(i,j)$
and invariant measure $\alpha(i)$.
Then the function

$$L(p) = S(p|\alpha) + S(\alpha|p) = \sum_{i=1}^{q} (p(i) - \alpha(i)) \log \frac{p(i)}{\alpha(i)} \tag{14}$$

is a linear stochastic Lyapunov function with center $\alpha$ w.r.t. the measure $P$ for the
constant (6).

Proposition 3.2 immediately follows from the following invariance property of the
recursion which is the main result of our paper. 

Proposition 3.3 Main Recursion Formula for expected symmetrized entropy.

$$\int P(d\xi)\, L(\pi_v^{N,\xi}) = \sum_{w:\, v \to w} \int P(d\xi)\, L(\pi_w^{N,\xi} M^{\mathrm{rev}}). \tag{15}$$

Warning: Pointwise, that is for fixed boundary condition, things fail and one has

$$L(\pi_v^{N,\xi}) \neq \sum_{w:\, v \to w} L(\pi_w^{N,\xi} M^{\mathrm{rev}}) \tag{16}$$

in general. In this sense the proposition should be seen as an invariance property which
limits the possible behavior of the recursion.
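
The recursion formula can be verified by exact enumeration in the smallest non-trivial case. The following Python sketch is an illustration under stated assumptions, not part of the proof. It takes the tree consisting of a root with d leaves (N = 1), where the posterior at a leaf is the point mass at the observed boundary value, and compares both sides of the formula for a non-reversible three-state matrix. The function names and the example matrix are made up for this sketch, and the invariant measure is approximated by power iteration.

```python
import math
from itertools import product

def invariant_distribution(M, iters=2000):
    """Approximate alpha with alpha M = alpha by power iteration (M has positive entries)."""
    q = len(M)
    alpha = [1.0 / q] * q
    for _ in range(iters):
        alpha = [sum(alpha[i] * M[i][j] for i in range(q)) for j in range(q)]
    return alpha

def reversed_chain(M, alpha):
    """M_rev(i, j) = alpha(j) M(j, i) / alpha(i)."""
    q = len(M)
    return [[alpha[j] * M[j][i] / alpha[i] for j in range(q)] for i in range(q)]

def sym_entropy(p, alpha):
    """L(p) = sum_i (p(i) - alpha(i)) log(p(i) / alpha(i))."""
    return sum((pi - ai) * math.log(pi / ai) for pi, ai in zip(p, alpha))

def recursion_sides(M, d):
    """Both sides of the recursion on the tree made of a root and d leaves."""
    q = len(M)
    alpha = invariant_distribution(M)
    M_rev = reversed_chain(M, alpha)
    lhs = 0.0
    for xi in product(range(q), repeat=d):
        # joint weights alpha(j) * prod_w M(j, xi_w); their sum is P(xi)
        joint = [alpha[j] * math.prod(M[j][s] for s in xi) for j in range(q)]
        prob = sum(joint)
        posterior = [x / prob for x in joint]
        lhs += prob * sym_entropy(posterior, alpha)
    # at a leaf the posterior is the point mass at xi_w, so pi_w M_rev is the
    # row M_rev[xi_w]; each leaf value is distributed according to alpha
    rhs = d * sum(alpha[i] * sym_entropy(M_rev[i], alpha) for i in range(q))
    return lhs, rhs

M = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.3, 0.2, 0.5]]
print(recursion_sides(M, 3))
```

The two numbers agree up to rounding, while the individual summands for a fixed boundary condition do not match term by term, which is the content of the warning.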

Proof of Proposition 3.3. We need the measure on boundary configurations at
distance $N$ from the root on the tree emerging from $v$ which is obtained by conditioning
the spin in the site $v$ to take the value $j$, namely

$$Q_v^{N,j}(\xi) := P_v^N\big( \sigma : \sigma|_{\partial T_v^N} = \xi \,\big|\, \sigma(v) = j \big). \tag{17}$$

Then the double expected value w.r.t. the a priori measure $\alpha$ of the boundary
relative entropies can be written as an expected value w.r.t. $P$, that is, over boundary conditions
distributed according to the open b.c. measure, of the symmetrized entropy between the distribution
at $v$ and $\alpha$, in the following form.



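Lemma 3.4

$$\underbrace{\int P(d\xi)\, L(\pi_v^{N,\xi})}_{\text{symmetric entropy at } v} = \underbrace{\int \alpha(dx_1) \int \alpha(dx_2)\, S\big(Q_v^{N,x_2} \big| Q_v^{N,x_1}\big)}_{\text{boundary entropy}} \tag{18}$$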



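Proof of Lemma 3.4. In the first step we express the relative entropy as an
expected value

$$S\big(Q_v^{N,x_2} \big| Q_v^{N,x_1}\big) = \int P_v^N(d\xi)\, \frac{d\pi_v^N}{d\alpha}(x_2) \left( \log \frac{d\pi_v^N}{d\alpha}(x_2) - \log \frac{d\pi_v^N}{d\alpha}(x_1) \right). \tag{19}$$

Here we have used that, with obvious notations,

$$\frac{dQ_v^{N,x_2}}{dP_v^N}(\xi) = \frac{P_v^N(\sigma(v) = x_2, \xi)}{P_v^N(\sigma(v) = x_2)\, P_v^N(\xi)} = \frac{d\pi_v^N}{d\alpha}(x_2). \tag{20}$$

Further we have used that

$$\log \frac{dQ_v^{N,x_2}}{dQ_v^{N,x_1}} = \log \frac{d\pi_v^N}{d\alpha}(x_2) - \log \frac{d\pi_v^N}{d\alpha}(x_1), \tag{21}$$

for $x_1, x_2 \in \{1, \dots, q\}$. This gives

$$\begin{aligned}
\int \alpha(dx_1) \int \alpha(dx_2)\, S\big(Q_v^{N,x_2} \big| Q_v^{N,x_1}\big)
&= \int P(d\xi) \int \alpha(dx_2)\, \frac{d\pi_v^N}{d\alpha}(x_2) \log \frac{d\pi_v^N}{d\alpha}(x_2) \\
&\quad - \int P(d\xi) \int \alpha(dx_1) \int \alpha(dx_2)\, \frac{d\pi_v^N}{d\alpha}(x_2) \log \frac{d\pi_v^N}{d\alpha}(x_1) \\
&= \int P(d\xi)\, S(\pi_v^N | \alpha) + \int P(d\xi)\, S(\alpha | \pi_v^N)
\end{aligned} \tag{22}$$

and finishes the proof of Lemma 3.4. □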
Let us continue with the proof of the Main Recursion Formula. We need two more
ingredients formulated in the next two lemmas. The first gives the recursion of the
probability vectors $\pi_v^N$ in terms of the values $\pi_w^N$ of their children $w$, which is valid for
any fixed choice of the boundary condition $\xi$.

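Lemma 3.5 Deterministic recursion.

$$\pi_v^N(j) = \frac{\alpha(j) \prod_{w:\, v \to w} \sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_k \alpha(k) \prod_{w:\, v \to w} \sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)}, \tag{23}$$

or, equivalently: for all pairs of values $j, k$ we have

$$\log \frac{d\pi_v^N}{d\alpha}(j) - \log \frac{d\pi_v^N}{d\alpha}(k) = \sum_{w:\, v \to w} \log \frac{\sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)}. \tag{24}$$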

The proof of this Lemma follows from an elementary computation with conditional 
probabilities and will be omitted here. 
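
The first form of the deterministic recursion translates directly into code. The Python sketch below is a minimal illustration: the function name `parent_posterior` and the labelling of states by 0, ..., q-1 are assumptions of the sketch. It computes the posterior at a vertex from the posteriors at its children, for given M and invariant measure α.

```python
def parent_posterior(child_posteriors, M, alpha):
    """Posterior at v from the posteriors at its children.

    The unnormalised weight of state j is
    alpha(j) * prod_w sum_i M(j, i) * pi_w(i) / alpha(i).
    """
    q = len(alpha)
    weights = []
    for j in range(q):
        w = alpha[j]
        for pi_w in child_posteriors:
            w *= sum(M[j][i] * pi_w[i] / alpha[i] for i in range(q))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

M = [[0.7, 0.3], [0.9, 0.1]]
alpha = [0.75, 0.25]           # invariant for this M: 0.75 * 0.3 == 0.25 * 0.9
# two leaf children observed in states 0 and 1 (point-mass posteriors)
print(parent_posterior([[1.0, 0.0], [0.0, 1.0]], M, alpha))
```

If every child posterior equals α, meaning the boundary carries no information, each inner sum equals 1 and the recursion returns α, which is the fixed point around which the symmetrized entropy is measured.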

We also need to take into account the forward propagation of the distribution of 
boundary conditions from the parents to the children, formulated in the next lemma. 

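Lemma 3.6 Propagation of the boundary measure.

$$Q_v^{N,j} = \prod_{w:\, v \to w} \sum_i M(j,i)\, Q_w^{N,i}. \tag{25}$$

This statement follows from the definition of the model. Now we are ready to head
for the Main Recursion Formula.

We use the second form of the statement of the deterministic recursion Lemma,
equation (24), to write the boundary entropy in the form

$$S\big(Q_v^{N,j} \big| Q_v^{N,k}\big) = \int Q_v^{N,j}(d\xi) \sum_{w:\, v \to w} \log \frac{\sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)}. \tag{26}$$

Next, substituting the Propagation-of-the-boundary-measure Lemma 3.6 and (20)
we write

$$\begin{aligned}
S\big(Q_v^{N,j} \big| Q_v^{N,k}\big)
&= \sum_{w:\, v \to w} \int Q_v^{N,j}(d\xi)\, \log \frac{\sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)} \\
&= \sum_{w:\, v \to w} \sum_l M(j,l) \int Q_w^{N,l}(d\xi)\, \log \frac{\sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)} \\
&= \sum_{w:\, v \to w} \int P(d\xi) \sum_l \frac{M(j,l)}{\alpha(l)}\, \pi_w^N(l)\, \log \frac{\sum_i \frac{M(j,i)}{\alpha(i)}\, \pi_w^N(i)}{\sum_i \frac{M(k,i)}{\alpha(i)}\, \pi_w^N(i)} \\
&= \sum_{w:\, v \to w} \int P(d\xi)\, \frac{\pi_w^N M^{\mathrm{rev}}(j)}{\alpha(j)}\, \log \frac{\pi_w^N M^{\mathrm{rev}}(j) / \alpha(j)}{\pi_w^N M^{\mathrm{rev}}(k) / \alpha(k)},
\end{aligned} \tag{27}$$

using in the last step the definition of the reversed Markov chain. Finally applying the
sum $\sum_{j,k} \alpha(j) \alpha(k) \cdots$ to both sides of (27) we get the Main Recursion Formula. To
see this, note that the l.h.s. of (27) together with this sum becomes the r.h.s. of the
equation in Lemma 3.4. For the r.h.s. of (27) we note that

$$\sum_{j,k} \alpha(j) \alpha(k)\, \frac{\pi_w^N M^{\mathrm{rev}}(j)}{\alpha(j)}\, \log \frac{\pi_w^N M^{\mathrm{rev}}(j) / \alpha(j)}{\pi_w^N M^{\mathrm{rev}}(k) / \alpha(k)} = L(\pi_w^N M^{\mathrm{rev}}). \tag{28}$$

This finishes the proof of the Main Recursion Formula Proposition 3.3. □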

Finally, Theorem 2.1 follows from Proposition 3.2 with the aid of the Wald equality
with respect to the expectation over Galton-Watson trees since the contraction of the
recursion and the Lyapunov function properties yield

$$\lim_{N \uparrow \infty} P\big( \xi : |\pi^{N,\xi}(s) - \alpha(s)| > \varepsilon \big) = 0, \tag{29}$$

for all $s$, for all $\varepsilon > 0$, and this implies the extremality of the measure $P$. This ends the
proof of Theorem 2.1. □